A semiconductor integrated circuit

(57) Herein disclosed is a semiconductor integrated circuit capable
of executing processing operations on two-dimensional data with high
parallelism and at a high speed.

The semiconductor integrated circuit comprises: a two-dimensional
memory array (MAR); a parallel data transfer circuit (TRC) for
transferring, in parallel to a processing circuit group, the data
read out in parallel through data lines by selecting the word lines
of the two-dimensional memory array; and the processing circuit
group (PE) for executing processing operations in parallel by using
the data transferred from said parallel data transfer circuit. Each
of the processing circuits can make access, through the parallel
data transfer circuit, to a plurality of word lines and data lines
of said two-dimensional memory array, and the ranges of the data
lines of the two-dimensional memory array to which adjoining
processing circuits can make access overlap.

Since the data lines of the two-dimensional memory array, to which
the adjoining processing circuits can make access, have an
overlapped range, the convolution processing operations or the like
can be executed in parallel for the two-dimensional data stored in
the two-dimensional memory array.
The present invention relates to a semiconductor integrated circuit
using a two-dimensional memory array and, more particularly, to a
semiconductor integrated circuit which permits the execution, in
real time, of either a digital filter processing operation such as a
convolution processing operation or a processing operation using
two-dimensional data such as a search of the moving vector of a
moving image.

Among the various information processing operations, the image
processing operation frequently handles two-dimensional data because
the pixels are arrayed two-dimensionally on a CRT display. A
representative example of these information processing operations is
the two-dimensional filter processing operation.

Fig. 2 shows a semiconductor integrated circuit of the prior art for
processing an image. This device is suitable for the two-dimensional
filter processing operation, as disclosed by Yoshiki Kobayashi,
Tadashi Fukushima, Syuichi Miura, Morio Kanasaki and Kohtaro
Hirasawa, "A BiCMOS Image Processor with Line Memories", ISSCC
Digest of Technical Papers, pp. 182-183, Feb. 1987.

The semiconductor integrated circuit of Fig. 2 will be summarized
here. As shown in Fig. 2A, this semiconductor integrated circuit
comprises: a pre-processing circuit PPU for executing a
pre-processing operation such as a threshold processing operation on
input image data; line memories LM1 and LM2, each storing the pixels
of one line to establish a delay of one line; a shift register SR; a
data memory DM for storing the weighting coefficients of a filter; a
processing circuit PE; and linkage units LU1 and LU2 including
adders. Fig. 2B shows an example of the calculation in the case in
which the semiconductor integrated circuit of Fig. 2A is used for
calculating the 3x3 space filter. In Fig. 2B, reference characters
F32 and F(x+i)(y+j) designate the (density) value of the pixel of
the third row and the second column in the frame of an input image,
and the value of the pixel of the (x+i)-th row and the (y+j)-th
column, respectively. Moreover, characters Wij (W-1-1, ..., W11)
designate filter coefficients, and characters Rxy designate the
value of the pixel of the x-th row and the y-th column in the frame
of the processed output image.

The operations of the semiconductor integrated circuit of Fig. 2A
will be described with reference to Fig. 2B. In the calculation of
the 3x3 space filter, as is well known in the art, the value of Rxy
can be expressed by the summation of the products of the values of
the pixels of the input image and the filter coefficients, as
expressed by the equation in Fig. 2B. In order to determine the
value of Rxy, the values of the nine input pixels around the pixel
of the x-th row and the y-th column in the frame of the input image
are required. The inputted image data are first inputted to the
pre-processing circuit PPU. Since the filter processing operation
needs no threshold processing, the inputted image data are
transmitted as they are to the shift register SR and the line memory
LM1. The output of the line memory LM1 is outputted with a delay of
one line. The output of the line memory LM1 is inputted to the line
memory LM2 so that it is outputted with an additional delay of one
line. As a result, the values of the pixels of the input image
necessary for calculating the 3x3 space filter are stored in
different shift registers for the individual lines. Fig. 2B shows
the status in which the values of the nine pixels of the input image
around F22 are stored in the shift registers. The values of the nine
pixels stored in the shift registers are sequentially inputted to
the processing circuits PE1, PE2 and PE3 so that their products with
the corresponding coefficients are calculated. The resultant
products are inputted to the linkage units LU1 and LU2 and are added
so that the value of R22 is determined in this case.

Thus, in the semiconductor integrated circuit of the prior art shown
in Fig. 2, the values of the pixels over the three lines are
inputted to the three processing circuits by making use of the
delays of the line memories so that three multiplications are
processed in parallel. As a result, the space filter can be
processed at a high speed. The aforementioned citation reports that
a BiCMOS device fabricated on trial with a 1.8-micron process could
calculate the 3x3 space filter in real time for a TV image composed
of 512x512 pixels.
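
The equation of Fig. 2B is not reproduced in this text. From the
definitions given above it can be reconstructed as follows, on the
assumption that i indexes the row offset and j the column offset and
that both run from -1 to 1:

\[
R_{xy} = \sum_{i=-1}^{1} \sum_{j=-1}^{1} W_{ij} \, F_{(x+i)(y+j)}
\]

For R22, for example, the nine terms run from W-1-1 x F11 to
W11 x F33, which matches the nine pixels around F22 held in the
shift registers.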

A first object to be achieved by the present invention is to provide
a semiconductor integrated circuit for performing processing
operations using two-dimensional data in high parallelism. A second
object is to integrate, at a high density on one semiconductor chip,
a two-dimensional memory cell array capable of storing massive
two-dimensional data together with a plurality of processing
circuits which execute the processing operations in high parallelism
by using the two-dimensional data.

In the semiconductor integrated circuit of the prior art shown in
Fig. 2, as has been described above, the calculations of the space
filter are executed at a high speed by performing the nine
multiplications necessary for calculating one output pixel three at
a time in parallel. For the future, however, the parallelism has to
be enhanced further to increase the speed.

As the image quality of a TV set, a workstation, a personal computer
or a game machine advances to a higher level, the number of pixels
of one frame increases so that the pixel frequency rises. Moreover,
it is anticipated that portable devices having communication and
display functions will be widely used in the near future. It is also
anticipated that such a device will have to produce a clear display
by variously processing the data of the moving image received by the
communication function. In such a device, a low-voltage battery is
mounted as the power source to drive the device. Generally speaking,
however, the speed of a semiconductor integrated circuit drops
substantially in proportion to the drop of the supply voltage, so
that the semiconductor integrated circuit of the prior art may be
unable to achieve a sufficient processing speed. In order to solve
this, the parallelism has to be raised to prevent the drop in the
processing speed. It is therefore desired to provide a semiconductor
integrated circuit which has a higher parallelism and which can
process two-dimensional data at a high speed.

In a device for handling an image, moreover, there is used the
so-called "image memory" for storing the data of at least one
display so that the formation and processing of an image by the CPU
and the drawing of the image on the CRT can be performed
simultaneously. Integrating the image memory and the device which
executes the processing operations of the two-dimensional data in
high parallelism on a common semiconductor chip contributes to
reducing the size of a device for handling an image, especially a
portable device.

According to a representative embodiment of the present invention, a
semiconductor integrated circuit comprises: a memory cell array
(MAR) including a plurality of data lines (DG), a plurality of word
lines (W1 to W3) intersecting the plurality of data lines (DG), and
a plurality of memory cells disposed at desired intersections
between the plurality of data lines (DG) and the plurality of word
lines (W1 to W3); a parallel data transfer circuit (TRC) for
transferring a plurality of data in parallel from the plurality of
data lines (DG); and a plurality of processing circuits (PE1 to PEn)
for receiving the plurality of data transferred from the parallel
data transfer circuit (TRC), as their input signals,

characterized: in that the parallel data transfer circuit (TRC) is
enabled to transfer two or more of the plurality of data to the
individual ones of the plurality of processing circuits (PE1 to PEn)
by sequentially selecting two or more of the plurality of data lines
(DG) and connecting them with the individual ones of the plurality
of processing circuits (PE1 to PEn); and in that the adjoining ones
of the plurality of processing circuits (PE1 to PEn) can input the
same data from the same data lines.

Since the ranges of the data lines of the two-dimensional memory
array to which the adjoining processing circuits can make access
overlap, it is possible to execute a filter processing operation of
an image in which the value of a pixel is calculated from the values
of the pixels neighboring that pixel. In the 3x3 filter, for
example, the two-dimensionally distributed surrounding 3x3 input
pixels are required for obtaining the result of one output pixel,
and the filter processing operation in the line direction can be
executed by inputting the adjoining pixels on the same line to one
processing circuit. If, moreover, the processing circuit is designed
to execute the processing operation by the use of a plurality of
data groups read out to one of the aforementioned plurality of data
line groups by selecting two or more of a plurality of word lines,
the filter processing operation can be executed by inputting those
of the 3x3 input pixels which are arranged perpendicularly to the
line direction to one processing circuit. As a result, the filter
processing operation can be executed by inputting the 3x3 pixels to
one processing circuit. Since, moreover, the adjoining processing
circuits have an overlap of the ranges of the data lines to which
they can make access through the parallel data transfer circuit, the
convolution processing operation and the processing operation using
the two-dimensional data of the 3x3 filter or the like can be
processed in parallel by the plurality of processing circuits.

In the drawings: 
[Fig. 1]

An embodiment showing the construction (i.e., the 3x3 space filter)
of a semiconductor integrated circuit according to the present
invention.
[Fig. 2]

A semiconductor integrated circuit of the prior art using a line
memory.
[Fig. 3]

An embodiment showing the construction (i.e., the 5x5 space filter)
of a semiconductor integrated circuit according to the present
invention.
[Fig. 4]

An embodiment showing a first construction for loosening the layout
pitch of the processing circuit in the embodiment of Fig. 1.
[Fig. 5]

An embodiment showing the construction of a parallel data transfer
circuit in the embodiment of Fig. 4.
[Fig. 6]

An embodiment showing a method of controlling the parallel data
transfer circuit in the embodiment of Figs. 4 and 5.
[Fig. 7]

A second embodiment showing the construction for loosening the
layout pitch of the processing circuit in the embodiment of Fig. 1.
[Fig. 8]

An embodiment showing the construction of a parallel data transfer
circuit in the embodiment of Fig. 7.
[Fig. 9]

An embodiment showing a method of controlling the parallel data
transfer circuit in the embodiment of Figs. 7 and 8.
[Fig. 10]

An embodiment showing the construction of a moving vector processing
circuit using the present invention.
[Fig. 11]

An embodiment showing the construction of a minimum distance
processing circuit in the embodiment of Fig. 10.

Fig. 1 shows an embodiment of a semiconductor device according to
the present invention, that is, a construction of a device for
processing image data, which are inputted in real time, with a 3x3
space filter. Fig. 1 shows not only the construction of the present
embodiment but also the correspondences between the pixels of an
image frame and the contents of memory cells in the device as well
as a method of controlling a parallel data transfer circuit.
According to the present embodiment, the space filter for one line
of the frame of an output image can be processed in parallel. The
present embodiment is constructed, as shown, to comprise: a serial
access memory SAM1 for storing input pixels Fxy of one line and
writing them in parallel in a two-dimensional memory array MAR; the
two-dimensional memory array MAR for storing the values of the
pixels of three lines, which are outputted from the serial access
memory SAM1; a sense amplifier SA for reading the values of the
pixels of one line of the two-dimensional memory array MAR in
parallel and latching them; a parallel data transfer circuit TRC for
transferring the read values in parallel to a processing circuit
group; a data memory DM for storing filter coefficients; and the
grouped processing circuits PE1, PE2, ..., and PEn for
multiplying/summing operations in parallel. The operations of the
present embodiment will be described in the following with reference
to Fig. 1.

First of all, the input image, composed of pixel data of P bits, is
serially inputted to the serial access memory SAM1. When the pixel
values F11, F12, ..., and F1k of its first line are stored, they are
written in parallel in a word line W1 of the two-dimensional memory
array MAR. Subsequently, the pixel values of the second and third
lines of the input image are likewise written in word lines W2 and
W3 each time they are stored in the serial access memory SAM1. Now,
the data of the three lines necessary for calculating the pixel
values of the frame of the output image are prepared in the
two-dimensional memory array MAR. The correspondences at this time
between the frame of the input image and the data on the word lines
of the two-dimensional memory array MAR are shown at the lower
lefthand of Fig. 1.

While the data of the next line are being written in the serial
access memory SAM1, the values R22, R23, ..., and R2k-1 of the
pixels of the second line of the frame of the output image are
calculated in parallel. At this time, the parallel data transfer
circuit is controlled over nine processing cycles, as shown at the
lower righthand of Fig. 1.

First of all, in the first cycle, the input pixels of one line,
which are stored in the word line W1 of the two-dimensional memory
array MAR, are read out and are latched through a data line group DG
in the sense amplifier SA. Here, the switch L, which is one of the
switches L, C and R of a selector SEL composing the parallel data
transfer circuit TRC, is turned ON. As a result, there are
transferred through the parallel data transfer circuit TRC the input
pixel F11 to the processing circuit PE1, the input pixel F12 to the
processing circuit PE2, ..., and the input pixel F1k-2 to the
processing circuit PEn. Simultaneously with this, a weighting
coefficient C-1-1 is read out from the data memory DM and is
multiplied by the input pixel which is inputted to each processing
circuit. Subsequently, in the second cycle, the switch C in the
selector SEL is turned ON to input through the parallel data
transfer circuit the input pixel F12 to the processing circuit PE1,
the input pixel F13 to the processing circuit PE2, ..., and the
input pixel F1k-1 to the processing circuit PEn, and these input
pixels are multiplied by the weighting coefficient C-10. In the
third cycle, the switch R in the selector SEL is turned ON to
likewise input through the parallel data transfer circuit the input
pixel F13 to the processing circuit PE1, the input pixel F14 to the
processing circuit PE2, ..., and the input pixel F1k to the
processing circuit PEn, and these input pixels are multiplied by the
weighting coefficient C-11.

After the input pixels stored in the word line W1 of the
two-dimensional memory array MAR have thus been used, the word line
W2 is then selected to read out and latch the input pixels of one
line in the sense amplifier. In the fourth cycle, the switch L in
the selector SEL is turned ON to transfer the input pixel F21 to the
processing circuit PE1, the input pixel F22 to the processing
circuit PE2, ..., and the input pixel F2k-2 to the processing
circuit PEn. These input pixels are multiplied by a weighting
coefficient C0-1, and the products are added to the previously
calculated values. Subsequently, in the fifth cycle, the switch C in
the selector SEL is turned ON to input the input pixel F22 to the
processing circuit PE1, the input pixel F23 to the processing
circuit PE2, ..., and the input pixel F2k-1 to the processing
circuit PEn. These input pixels are multiplied by the weighting
coefficient C00, and the products are added to the previously
calculated values. Likewise, in the sixth cycle, the switch R in the
selector SEL is turned ON to input the input pixel F23 to the
processing circuit PE1, the input pixel F24 to the processing
circuit PE2, ..., and the input pixel F2k to the processing circuit
PEn. These input pixels are multiplied by the weighting coefficient
C01, and the products are added to the previously calculated values.

If, moreover, similar calculations are executed in the seventh to
ninth cycles by selecting the word line W3, the values R22, R23,
..., and R2k-1 of the pixels of the second line of the output frame
are determined at the processing circuits PE1, PE2, ..., and PEn.
These pixel values R22, R23, ..., and R2k-1 are transferred in
parallel to the serial access memory SAM2 and are sequentially
outputted. Incidentally, for the pixels at the ends of a line, some
of the necessary input pixels do not exist, so that these pixels may
be transferred as they are, as shown.

In order to process the output image of the subsequent line, similar
operations may be repeated. Specifically, when the pixel information
of one line is stored in the serial access memory SAM1, it is
transferred to that word line of the two-dimensional memory array
which was rewritten least recently, and the output image of one line
is processed while the pixel information of the subsequent line is
being written in the serial access memory SAM1. Thus, according to
the present embodiment, the 3x3 space filter processing operations
of a plurality of pixels on the same line of the output frame can be
processed in parallel and in real time. The individual processing
circuits need only complete the processing operations and the data
transfers within the time period in which the pixels of one line are
inputted. As a result, the time period available for the processing
operations is longer than that of the prior art, in which the
processing operation is carried out within each time period in which
one pixel is inputted. In other words, the real time processing
operation can be accomplished even in case the input pixels have a
high frequency.
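
The nine-cycle sequence described above can be traced in software.
The following Python sketch is only an illustration under stated
assumptions: the function and variable names are hypothetical, the
processing circuits are modelled as a list of accumulators, and the
circuits that operate simultaneously in hardware are visited one
after another in a loop.

```python
def filter_line_nine_cycles(lines, coeff):
    """Model the nine processing cycles for one output line.

    lines: three rows standing in for word lines W1..W3, each a list of
           k pixel values.
    coeff: 3x3 list; coeff[w][s] is the weight used when word line w and
           selector switch s (0 = L, 1 = C, 2 = R) are active.
    Returns the k-2 interior output pixels (one per processing circuit).
    """
    k = len(lines[0])
    n = k - 2                      # number of processing circuits PE1..PEn
    acc = [0] * n                  # one accumulator per processing circuit
    for w in range(3):             # select word line W1, W2, W3 in turn
        latched = lines[w]         # the sense amplifier latches one line
        for s in range(3):         # turn ON switch L, then C, then R
            c = coeff[w][s]        # one coefficient is shared by all PEs
            for pe in range(n):    # simultaneous in hardware, looped here
                acc[pe] += c * latched[pe + s]
    return acc


if __name__ == "__main__":
    rows = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
    ones = [[1, 1, 1] for _ in range(3)]
    print(filter_line_nine_cycles(rows, ones))  # [54, 63]
```

The latched line is never shifted; only the selector index s changes
between cycles, which corresponds to the overlap of the data lines
reachable by adjoining processing circuits.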

As described above, moreover, the present embodiment can transfer
the information, which is latched in one sense amplifier, to
different processing circuits through the parallel data transfer
circuit TRC. As a result, the two-dimensional filter or the
convolution processing operation can be executed in parallel without
moving or transferring the data latched in the sense amplifier,
during the processing operation, between the sense amplifiers or
between the processing circuits. As a result, no extra circuit is
required for the transferring operations between the sense
amplifiers or between the processing circuits, so that high
integration and low power consumption can be realized. In the
present embodiment, as shown in Fig. 1, the processing circuits are
arranged just below the two-dimensional memory array MAR. As a
result, the data transfer distance from the two-dimensional memory
array to the processing circuits can be reduced to a very short
constant distance. As a result, in addition to an advantage that the
delay time period for the transfer is short, there can be attained
another advantage that the dispersion among the processing circuits
is small so that they can be easily synchronized. Since, moreover,
the parallel data transfer circuit and the processing circuits are
arranged adjacent to and just below the memory array, they can be
highly integrated to suppress the power consumption accompanying the
transfer of the pixel information.

The embodiment of Fig. 1 was directed to a device for the 3x3 filter
calculations. In the parallel data transfer circuit, therefore, one
processing circuit is connected with the data lines for three
pixels, and the sense amplifier and the processing circuits are
connected with an overlap of two pixels between the adjoining
processing circuits. As could be understood from the embodiment of
Fig. 1, the processing operation of a filter having an arbitrary
size of 3x3 or more can be accomplished by changing the construction
of the parallel data transfer circuit and the memory array. Fig. 3
shows an embodiment exemplifying a construction of the processing
device capable of processing a 5x5 filter. The present embodiment is
modified from the embodiment of Fig. 1 such that the number of the
word lines of the two-dimensional memory array MAR is increased to
five and such that the overlap of the parallel data transfer circuit
TRC is increased to four pixels. The selector SEL composing the
parallel data transfer circuit TRC is exemplified by a 5:1 selector
for selecting data of P bits from the data of 5P bits, and the
twenty-five coefficients necessary for the 5x5 filter can be stored
by increasing the capacity of the data memory. In the present
embodiment, one processing circuit can fetch the data from the sense
amplifier corresponding to five pixels, and the adjoining processing
circuits share the data lines of four pixels of the data line group.
As a result, the processing operations of the 5x5 filter can be
executed in parallel while selecting the word lines of the
two-dimensional memory array sequentially as in the embodiment of
Fig. 1. Incidentally, in the present embodiment, not only the
processing operation of the 5x5 filter can be executed, but also a
4x4 filter can be easily constructed, as could be easily understood,
by using four of the five word lines and four of the five sets of
wiring lines connected with one transfer circuit TRC. Likewise, it
is possible to execute the processing operation of a 3x3 filter or a
2x2 filter.
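
The relation between the filter size and the overlap of the data
lines can be checked with a few lines of Python. This is a sketch
with hypothetical names, in which one "group" stands for the P data
lines of one pixel. The cycle count per output line (one cycle per
coefficient) is an assumption made by analogy with the nine cycles
of the 3x3 case and is not stated in the text for the 5x5 filter.

```python
def reachable_groups(pe_index, size):
    """Data-line groups that processing circuit pe_index (0-based) can
    select through its selector for a size x size filter."""
    return list(range(pe_index, pe_index + size))


def shared_groups(size):
    """Number of data-line groups shared by two adjoining circuits."""
    left = set(reachable_groups(0, size))
    right = set(reachable_groups(1, size))
    return len(left & right)


def cycles_per_line(size):
    """Assumed multiply/sum cycles per output line: one per coefficient."""
    return size * size


if __name__ == "__main__":
    for size in (3, 5):
        print(size, shared_groups(size), cycles_per_line(size))
```

For sizes 3 and 5 the shared count is 2 and 4 respectively, in
agreement with the overlaps of two and four pixels given for Figs. 1
and 3.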

In the embodiments of Figs. 1 and 3, one processing circuit may be
arranged for P data lines if the value of the pixel is expressed by
P bits. In case the pixel value is expressed with an accuracy of 8
bits, for example, each processing circuit may be arranged within
the pitch of eight data lines. It may, however, be difficult to
arrange the processing circuits in case the processing circuits have
a large scale or in case the data lines of the two-dimensional
memory array have a narrow pitch.

In this case, the embodiment shown in Fig. 4 can be used. Fig. 4
presents one embodiment for loosening the layout pitch of the
processing circuits more than in the device of Fig. 1 for
calculating the 3x3 filter. In the present embodiment, the input
images of one line, which are inputted to the serial access memory
SAM1, are transferred through the parallel data transfer circuit
TRC1 composed of distributors DIS to a register RG1 having a
capacity of three lines. As a result, the layout pitch of the
processing circuits is three times as large as that of the
embodiment of Fig. 1. In the embodiment of Fig. 1, one processing
circuit can transfer the data from the data lines of three pixels,
and two of the transfer paths overlap between the adjoining
processing circuits. In the present embodiment, on the contrary, one
processing circuit can transfer data from the data lines of nine
pixels, and the parallel data transfer circuit is constructed such
that six data lines are shared between the adjoining processing
circuits. The operations of the present embodiment will be described
in the following with reference to Fig. 4.

First of all, when the input images of the first line are stored in
the serial access memory SAM1, the switches L in all the
distributors DIS are turned ON to write the input images in parallel
in the register RG1. When the input images of the second line are
then stored in the serial access memory SAM1, the switches C in all
the distributors DIS are turned ON to write the input images in
parallel in the register RG1. When the input images of the third
line are then stored in the serial access memory SAM1, the switches
R in all the distributors DIS are turned ON to write the input
images in parallel in the register RG1. The images of the
consecutive first, second and third lines, as thus written in the
register RG1, are transferred in parallel from the register RG1
through the data line group DG to the register RG2. Then, there are
prepared in the register RG2 the pixels of the input images for the
three lines necessary for processing the output images of the second
line. These data are transferred through the parallel data transfer
circuit TRC2 to the processing circuits so that the values of the
pixels of the second line of the output images are determined.
Incidentally, the transfers and processing operations of the data
have to be executed while the input images of the fourth line are
being written in the serial access memory SAM1. When the
calculations of the pixel values of the second line of the output
images are completed and the input images of the fourth line are
written in the serial access memory SAM1, the switch L of the
distributor DIS is turned ON to rewrite one third of the content of
the register RG1. Since, at this time, the images of the second,
third and fourth lines of the input images are prepared in the
register RG1, they are transferred in parallel from the register RG1
to the register RG2 to execute the processing operations of the
output images of the third line. If these operations are continued
each time the input images of one line are stored in the serial
access memory SAM1, the processing operations of the 3x3 filter can
be continuously executed in parallel. Incidentally, as to the
aforementioned operations, how the processing operations are
executed by sending the data from the register RG2 to the processing
circuits will be described with reference to Figs. 5 and 6.
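
The rotation of the switches L, C and R described above behaves like
a three-slot circular line buffer. The following Python sketch
models only that bookkeeping. The class and method names are
hypothetical, and returning the lines oldest first is an assumption
made for readability; the text does not say how the hardware orders
the lines or the coefficients after the register has wrapped around.

```python
class LineRegister:
    """Three-slot register standing in for RG1.

    Slots 0, 1 and 2 correspond to the distributor switches L, C and R.
    """

    def __init__(self):
        self.slots = [None, None, None]
        self.next_slot = 0         # slot to be overwritten by the next line
        self.lines_written = 0

    def write_line(self, line):
        """Store one input line in the slot selected by L, C or R in turn."""
        self.slots[self.next_slot] = list(line)
        self.next_slot = (self.next_slot + 1) % 3
        self.lines_written += 1

    def window(self):
        """Return the three most recent lines, oldest first, or None if
        fewer than three lines have been stored."""
        if self.lines_written < 3:
            return None
        return [self.slots[(self.next_slot + i) % 3] for i in range(3)]


if __name__ == "__main__":
    reg = LineRegister()
    for number in (1, 2, 3, 4):
        reg.write_line([number] * 4)
    print(reg.window())
```

After the fourth line is written, slot L holds line 4 while slots C
and R still hold lines 2 and 3, so only one third of the register
content has been rewritten, as in the description of Fig. 4.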

Figs. 5A and 5B show an example of the construction of the parallel
data transfer circuit TRC2 for the embodiment of Fig. 4. As shown in
Fig. 5A, the parallel data transfer circuit TRC2 has its selectors
SEL connected in two layers and individually fed with three control
signals φLi, φCi and φRi. The selector is composed of three switches
L, C and R, as shown at the lefthand side of Fig. 5B. When the
switch L is turned ON by the control signal φLi, a lefthand input
signal INL is outputted; when the switch C is turned ON by the
control signal φCi, a central input signal INC is outputted; and
when the switch R is turned ON by the control signal φRi, a
righthand input signal INR is outputted. These switches can be
constructed by connecting MOS transistors in parallel, as shown at
the righthand side of Fig. 5B. Fig. 5A shows the state in which the
input images of the first, second and third lines are transferred to
the register RG2. In this state, as described above, the pixel data
for processing the output images of the second line in parallel have
to be transferred to the processing circuits. Fig. 6 illustrates the
timings of the control signals for that purpose. In Fig. 6, letters
φL1, φC1 and φR1, and φL2, φC2 and φR2 designate the control signals
of the selectors SEL composing the parallel data transfer circuit
TRC2. Fig. 6 also shows which pixel data are outputted at the
individual times to the lefthand four outputs TN00, TN01, TN02 and
TN03 of the outputs of the parallel data transfer circuit TRC2. With
the processing circuit PE1, as shown in Fig. 5A, there is connected
the output TN01 of the parallel data transfer circuit TRC2. As a
result, it is found from Fig. 6 that the pixels F11, F12, F13, F21,
F22 and so on, that is, the 3x3 pixel data around the pixel F22, are
inputted to the processing circuit PE1. Likewise, the 3x3 pixel data
around the pixel F23 are inputted to the processing circuit PE2, and
the 3x3 pixel data around the pixel F24 are inputted to the
processing circuit PE3. As a result, the output images of the second
line can be processed in parallel by the processing circuits PE1,
PE2, PE3, and so on. The processing operations of the output images
on the third and later lines can be likewise carried out.
Incidentally, the 3x3 filter cannot be processed as to the lefthand
end TN00 so that the output is made not through the processing
circuit but as it is, as in Fig. 1. According to the embodiment
shown in Figs. 4, 5 and 6, as described above, the layout pitch of
the processing circuits is loosened, and the two-dimensional filter
operations can be executed in parallel for each line of the output
images. Incidentally, although the 3x3 filter is exemplified here,
the present invention can be easily expanded to the processing
operations of a larger filter.

Fig. 7 shows a second embodiment for loosening the layout pitch of
the processing circuits more than that of the device of Fig. 1 for
calculating the 3x3 filter. In Fig. 4, the loose layout pitch is
realized by arranging the same number of processing circuits as
that of the device of Fig. 1 over three times the layout width. In
the present embodiment, by contrast, the layout pitch is loosened
by reducing the number of processing circuits to one third and by
arranging the processing circuits within the same layout width as
that of the embodiment of Fig. 1. Figs. 8A and 8B show an example
of the construction of the parallel data transfer circuit TRC1 for
the embodiment of Fig. 7. Fig. 8A shows the state in which the
pixel values F11, F12, ..., and so on of the first line of the
input images are transferred to the sense amplifier SA. The
parallel data transfer circuit TRC1 is constructed of a kind of
5:1 selector SEL for selecting P bits from 5P bits, as shown in
Fig. 7. Fig. 8A shows an embodiment in which the selector SEL of
Fig. 7 is composed of a kind of 2:1 selector SEL2-1 for selecting
P bits from 2P bits. The selectors are connected in three layers,
and each selector SEL2-1 is fed with the two control signals φLi
and φRi. The selector SEL2-1 is composed of the two switches L and
R, as shown at the lefthand side of Fig. 8B. The lefthand input
signal INL is outputted when the switch L is turned ON by the
control signal φLi, and the righthand input signal INR is
outputted when the switch R is turned ON by the control signal
φRi. These switches can be constructed by connecting the MOS
transistors in parallel, as shown at the righthand side of
Fig. 8B.
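The behavior of a single selector SEL2-1 can be modeled in a few lines. This is only a behavioral sketch: the function name `sel2_1` is hypothetical, the requirement that exactly one control signal be ON is an assumption, and the wiring of the three selector layers of Fig. 8A is not reproduced.

```python
def sel2_1(inl, inr, phi_l, phi_r):
    """Behavioral model of one SEL2-1 stage.

    inl, inr     : lefthand and righthand input data (P bits each)
    phi_l, phi_r : control signals for the switches L and R
    Assumes exactly one switch is ON at a time.
    """
    if phi_l == phi_r:
        raise ValueError("exactly one of phi_l / phi_r must be ON")
    return inl if phi_l else inr


print(sel2_1("F11", "F12", True, False))   # F11
print(sel2_1("F11", "F12", False, True))   # F12
```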

The operations of the embodiment shown in Figs. 7 and 8 will be
described in the following with reference to Fig. 9. In Fig. 9,
(φL1, φR1), (φL2, φR2) and (φL3, φR3) designate the individual
control signals for the selectors SEL composing the parallel data
transfer circuit TRC1 shown in Fig. 8. Fig. 9 illustrates the
timings of the selections of the word lines and the aforementioned
control signals, the pixel data to be outputted from the lefthand
four outputs TN00, TN01, TN02 and TN03 of the parallel data
transfer circuit TRC1, and the timings for turning ON the switches
L, C and R of the distributor DIS in the parallel data transfer
circuit TRC2. In the present embodiment, since the number of the
processing circuits is reduced to one third, three consecutive
output pixels are processed by one processing circuit. First of
all, the input images of the first line are stored in the serial
access memory SAM1 and are then transferred to the word line W1 of
the two-dimensional memory array MAR. Likewise, the input images
of the second and third lines are transferred to the word lines W2
and W3, and the processing of the output images of the second line
is then started. The input images of the first line on the word
line W1 are read out through the data line group DG, and their
pixels F11, F12, F13, ..., and so on are latched in order from the
lefthand in the sense amplifiers, as shown in Fig. 8A. After this,
the control signals of the selector SEL in the parallel data
transfer circuit TRC1 are switched, as indicated in the column of
a cycle t1 of Fig. 9. Then, the pixels F11, F14, F17, ..., and so
on are respectively transferred through the outputs TN01, TN02 and
TN03 of the parallel data transfer circuit TRC1 to the processing
circuits PE1, PE2 and PE3. As a result, the pixels are multiplied
in the multipliers MT1, MT2, ..., and so on by the weighting
coefficients read out from the data memory, and the resultant
products are stored in the registers RG1, RG2, ..., and so on.
Subsequently, the control signals of the selector SEL are
switched, as indicated at the column of a cycle t2 in Fig. 9.
Then, the pixels F12, F15, F18, ..., and so on are respectively
transferred to the processing circuits PE1, PE2 and PE3. These
data are multiplied by the weighting coefficients. The products
are added to the preceding results stored in the registers, and
the sums are stored again in the registers. As indicated at the
column of a cycle t3 in Fig. 9, moreover, the control signals are
switched to transfer the pixels F13, F16, F19, ..., and so on
respectively to the processing circuits PE1, PE2 and PE3. These
data are multiplied and added to the preceding results. The
results thus far obtained are written in the serial access memory
SAM2 through the switch L of the distributor DIS in the parallel
data transfer circuit TRC2 of Fig. 7. The data are intermittently
written in the serial access memory SAM2.

Next, while the input pixels of the first line are being latched
in the sense amplifiers, the data are transferred, as indicated at
cycles t4 to t6 in Fig. 9, and the processed results are
intermittently written in the serial access memory SAM2 by turning
ON the switch C of the distributor DIS.

Subsequently, while the input pixels of the first line are being
latched in the sense amplifiers, the data are transferred, as
indicated at cycles t7 to t9 in Fig. 9, and the processed results
are intermittently written in the serial access memory SAM2 by
turning ON the switch R of the distributor DIS. After this, the
word line W2 is selected to latch the input pixels of the second
line in the sense amplifiers, and similar processing operations
are carried out. Here, at the starts of the cycles t1, t4 and t7,
the results obtained by using the input pixels of the first line
are fetched from the serial access memory SAM2 into the registers
RG1, RG2, ..., and so on shown in Fig. 7, and the newly obtained
multiplied results are added to the fetched results. When similar
operations are executed by selecting the word line W3 to latch the
input pixels of the third line in the sense amplifiers, all the
values of the pixels of the output images of the second line are
determined in the serial access memory SAM2. If these operations
are continued each time the input pixels of one line are stored in
the serial access memory SAM1, the processing operations of the
3x3 filter can be continuously carried out. Like the embodiment of
Fig. 4, the present embodiment can achieve an advantage that the
layout pitch of the processing circuits can be made three times as
large as that of the embodiment of Fig. 1. The present embodiment
can reduce the number of processing circuits to one third, because
one processing circuit performs the processing operations of three
consecutive output pixels, so that it is suitable for the case in
which many processing circuits cannot be integrated on one chip.
Incidentally, in order to loosen the pitch of the processing
circuits more, it is sufficient that one processing circuit
process three or more consecutive pixels, as could be easily
understood. For this, as could also be easily understood, the
transfer network may be so constructed that the data can be
transferred to one processing circuit from the numerous sense
amplifiers while leaving two transfer paths overlapped in the
adjoining processing circuits.
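The time-multiplexed schedule of Fig. 9 can be followed with a minimal simulation sketch. It assumes zero-based list indices (so F11 is `line[0]`), models the serial access memory SAM2 as a list of partial sums initialized to zero, and leaves the end pixels unprocessed; the function name and the test data are hypothetical.

```python
def filter_line_time_multiplexed(rows, weights):
    """One PE handles three consecutive output pixels (phases L, C, R)."""
    width = len(rows[0])
    n_pe = (width - 2) // 3            # one third as many PEs as output pixels
    sam2 = [0] * width                 # stands in for serial access memory SAM2
    for r, line in enumerate(rows):    # word lines W1, W2, W3 in turn
        for phase in range(3):         # distributor switches L, C, R
            # preceding partial results are fetched from SAM2 into the registers
            regs = [sam2[3 * k + phase + 1] for k in range(n_pe)]
            for tap in range(3):       # three cycles, e.g. t1 to t3
                for k in range(n_pe):  # all PEs work in parallel in hardware
                    regs[k] += weights[r][tap] * line[3 * k + phase + tap]
            for k in range(n_pe):      # results are written back intermittently
                sam2[3 * k + phase + 1] = regs[k]
    return sam2


rows = [[(3 * r + 7 * x) % 11 for x in range(11)] for r in range(3)]
weights = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
result = filter_line_time_multiplexed(rows, weights)

# Direct 3x3 weighted sum for comparison
reference = [sum(weights[r][j] * rows[r][x - 1 + j]
                 for r in range(3) for j in range(3))
             for x in range(1, 10)]
print(result[1:10] == reference)
```

In the first cycle of phase L, PE1, PE2 and PE3 read indices 0, 3 and 6, which corresponds to the pixels F11, F14 and F17 named in the text. Each PE reaches five input columns in total, which is consistent with the 5:1 selector.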

The embodiments thus far described with reference to the Figures
up to and including Fig. 9 exemplify the two-dimensional linear
filter. Thanks to these embodiments, the lines or edges in the
image can be emphasized or smoothed at a high speed by changing
the sizes and coefficients of the filters. By changing the
functions of the processing circuit, moreover, the extraction of a
specific pattern or the processing operations of a non-linear
filter such as a median filter can be executed at a high speed.
Moreover, the foregoing embodiments can naturally be utilized, as
long as the outputs are processed by using the information of
neighboring cells two-dimensionally distributed, for processing a
cellular automaton or a neural network coupled to the neighboring
neurons only. Incidentally, in the Figures for describing the
aforementioned embodiments, the two-dimensional memory cell array
is made to store only the data of the pixels of the number of
lines necessary for the processing operations. By increasing the
number of the word lines of the two-dimensional memory array,
however, it is easy to store the pixel data of more lines. If the
data of one frame are to be stored, for example, the embodiments
can also be used as the so-called "frame memory". In this case,
only a portion of the two-dimensional memory array is processed,
whereas the remaining data are serially read out and outputted as
they are, so that only a portion of the screen can be processed by
the filter or the like. Still moreover, the area to be processed
can be easily moved merely by changing the control of the word
lines.
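The frame-memory use and the exchangeable processing function can be illustrated by a small sketch. It assumes a frame held as a list of lines, a 3x3 neighborhood, and a processed band given by a first and last line index; `process_region` and its parameters are hypothetical names, and the median is used only as one example of a non-linear function.

```python
from statistics import median


def process_region(frame, first_line, last_line, pe_function):
    """Apply pe_function to 3x3 neighborhoods of the selected lines only.

    Lines outside [first_line, last_line] and the border pixels are
    output as they are, as in the frame-memory use described above.
    """
    out = [list(line) for line in frame]
    top = max(first_line, 1)
    bottom = min(last_line, len(frame) - 2)
    for y in range(top, bottom + 1):        # chosen by the word line control
        for x in range(1, len(frame[y]) - 1):
            window = [frame[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = pe_function(window)
    return out


frame = [[0] * 5 for _ in range(5)]
frame[2][2] = 9   # isolated spike inside the processed band
frame[3][1] = 7   # isolated spike outside the processed band
filtered = process_region(frame, 2, 2, median)
print(filtered[2], filtered[3])  # [0, 0, 0, 0, 0] [0, 7, 0, 0, 0]
```

Moving the processed area then amounts to calling the function with a different line range, while the processing function can be swapped without touching the data path.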

Here will be described an embodiment for detecting a moving vector
as an example in which the present invention is applied to a
processing operation other than the filter. The detection of the
moving vector is useful for compressing/uncompressing a digital
moving image. Because of the large amount of data to be processed,
however, there is desired a device for detecting the moving vector
at a high speed. As well known in the art, the moving vector is
detected by dividing the input image into blocks composed of a
plurality of pixels, by comparing each block with the block
positioned to correspond to it in a reference image and with a
plurality of blocks positioned in the neighborhood of the former
to determine the block having the shortest distance, and by
determining the coordinate difference of that block from the block
of the input image.

Figs. 10 and 11 show an embodiment of a device for processing the
moving vector of a moving image by applying the present invention.
In order to simplify the description, it is assumed in the
following that the block has a size of 3x3 pixels and that the
search has a scope of two pixels in the vertical and horizontal
directions. However, the present embodiment should not be limited
to those numerical values but can be easily expanded. Fig. 11
shows a construction of a minimum distance processing unit for
determining the minimum of the inter-block distances, as
determined in Fig. 10, to output the moving vector. Here will be
described the construction and operations of the present
embodiment.

In the device of Fig. 10, a pixel Fxy of the input image and a
pixel REFx'y' of a reference image to be used for comparison with
it are individually inputted in real time to the serial access
memories SAM2 and SAM1, respectively. After having been inputted
to the serial access memories, the pixels are transferred to
two-dimensional buffer arrays BAF2 and BAF1 for three lines and
further to the two-dimensional memory arrays MAR2 and MAR1 for the
comparison. The two-dimensional memory array MAR2 can store the
input images of three lines so that a block having a size of 3x3
pixels can be stored in one column. On the other hand, the
two-dimensional memory array MAR1 can store the images of seven
lines, which include two lines above and two lines below in
addition to the position corresponding to the block of the input
images in the memory array MAR2. Incidentally, the input images to
be inputted to the serial access memory SAM2 are inputted with a
delay of two lines from the reference image inputted to the serial
access memory SAM1, and the data are transferred from the serial
access memories SAM1 and SAM2 respectively to the buffer arrays
BAF1 and BAF2 and the memory arrays MAR1 and MAR2 each time the
data of one line are stored. As a result, the image in the memory
array MAR1 has two extra lines in each vertical direction in
addition to the position corresponding to the block of the input
image in the memory array MAR2. The two-dimensional buffer arrays
BAF2 and BAF1 of three lines are provided for temporarily storing
the data for determining the moving vector of the block of a next
column while the moving vector of the block of one column is being
determined. At the end of each processing operation of the moving
vector of the block of one column, the data of those
two-dimensional buffer arrays BAF1 and BAF2 are transferred to the
memory arrays MAR1 and MAR2 so that the moving vector of the block
of the next one column is processed.

In order to determine the moving vector, as described above, it is
necessary to calculate the distance between the block of the input
image and the block of the reference image which is positionally
shifted in the vertical and horizontal directions. The inter-block
distance can be determined by summing the differences between the
values of the pixels composing one block and the pixels composing
another block. In the embodiment of Fig. 10, the distances between
the pixels read out from the memory arrays MAR2 and MAR1 are
calculated in parallel by the processing circuits PE1, ..., and
PEn. If the control signals φL, φC and φR of the parallel data
transfer circuit TRC2 are switched each time the word lines of the
memory array MAR2 are selected one by one, it is possible to
transfer the pixels of different blocks to the processing
circuits. On the other hand, the memory array MAR1 has the data of
the reference image for two extra lines in each vertical direction
in addition to the position corresponding to the block of the
input image in the memory array MAR2. By switching the word lines,
therefore, the y coordinate of the pixels to be transferred can be
changed within two pixels in each vertical direction in addition
to the position corresponding to the block of the input image. By
switching the control signals of the parallel data transfer
circuit TRC1, moreover, the pixels, which are also displaced in
the x direction within the range of seven pixels in total covering
two pixels in each horizontal direction in addition to the
position corresponding to the block of the input image, can be
transferred to the individual processing circuits. As a result,
the coordinates of the block of the reference image to be inputted
to the processing circuits can be shifted within a range of two
pixels in the x and y directions with respect to the input image.
Incidentally, the signal lines of the parallel data transfer
circuit TRC1 are required to have an overlap of four lines, but
the signal lines of the output TN1 need not have any overlap.
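The inter-block distance for one shift can be sketched as follows. The sketch assumes that the "difference" between pixels is the absolute difference (the text does not specify the measure), uses zero-based indices, and works on 7x7 test images so that every shift within two pixels stays inside the reference data; `block_distance` and the test data are hypothetical.

```python
def block_distance(inp, ref, bx, by, dx, dy, size=3):
    """Sum of absolute pixel differences between the input block at
    (bx, by) and the reference block shifted by (dx, dy).
    The absolute difference is an assumption about the distance measure."""
    total = 0
    for y in range(size):          # one word line selection per block row
        for x in range(size):
            total += abs(inp[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


SEARCH = 2  # search scope of two pixels in each direction
ref = [[(5 * y * y + 3 * x * x + x * y) % 17 for x in range(7)]
       for y in range(7)]
# Input image: the reference content displaced so that the best shift is (-1, 1)
inp = [[ref[min(y + 1, 6)][max(x - 1, 0)] for x in range(7)]
       for y in range(7)]

distances = {(dx, dy): block_distance(inp, ref, 2, 2, dx, dy)
             for dy in range(-SEARCH, SEARCH + 1)
             for dx in range(-SEARCH, SEARCH + 1)}
print(len(distances), distances[(-1, 1)])  # 25 0
```

With a 3x3 block and a shift range of two pixels, the reference rows touched span 3 + 2 + 2 = 7 lines, which matches the seven-line capacity stated for the memory array MAR1.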

The distance between the block of the input image and the block of
the reference image is determined in the following manner. First
of all, the shift of the coordinates is fixed, and the pixels of
the block of the input image and the block of the reference image
are transferred to the individual processing circuits PE1, ...,
and PEn. The distances between the pixels, as determined by the
processing circuits, are transferred to accumulators ACC1, ...,
and ACCn so that their values for one block are accumulated. The
distances between the blocks, as thus determined, are transferred
to minimum distance processing units MIN1, ..., and MINn. These
minimum distance processing units determine such a shift of
coordinates as minimizes the distances between the blocks. The
construction of the minimum distance processing units is shown in
Fig. 11. The minimum distance processing unit MINi is constructed,
as shown in Fig. 11, to include a comparator COM, registers REG1
and REG2 and switches SWB1 and SWB2. The inter-block distance
BLDi(Δx, Δy) for predetermined shifts Δx and Δy is inputted, when
determined, to the comparator COM. This comparator COM compares
the newly determined inter-block distance BLDi(Δx, Δy) and the
inter-block distance BLDi(Δx', Δy') of other shifts Δx' and Δy',
as already determined and stored in the register REG1. If the
result reveals that the distance BLDi(Δx, Δy) is smaller, the
switch SWB1 is turned ON to update the content of the register
REG1 to the value BLDi(Δx, Δy). The register REG2 is stored with
the shift (Δx', Δy'), which is also updated to (Δx, Δy) when the
switch SWB2 is turned ON. If the distance BLDi(Δx, Δy) is larger,
on the contrary, the switches SWB1 and SWB2 are not turned ON so
that the contents of the registers are not updated. By executing
the operations described above for all the shifts, the register
REG2 comes to hold the shift minimizing the inter-block distance,
i.e., a moving vector MC. In Fig. 10, the moving vectors of the
blocks of one column are determined in parallel, and they are
transferred to the serial access memory SAM3 and sequentially
outputted to the outside of the chip.
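The register update performed by one minimum distance processing unit can be written as a short behavioral sketch. The class name `MinDistanceUnit` is hypothetical, the initially empty registers are an assumption (the text does not say how REG1 is initialized), and the distances fed in below are made-up illustration values.

```python
class MinDistanceUnit:
    """Behavioral model of one unit MINi with registers REG1 and REG2."""

    def __init__(self):
        self.reg1 = None  # smallest inter-block distance seen so far
        self.reg2 = None  # shift (dx, dy) that produced it

    def update(self, distance, shift):
        # Comparator COM: update only when the new distance is smaller
        if self.reg1 is None or distance < self.reg1:
            self.reg1 = distance   # switch SWB1 turned ON
            self.reg2 = shift      # switch SWB2 turned ON
        # otherwise both switches stay OFF and the registers keep their contents


unit = MinDistanceUnit()
for shift, distance in [((-2, -2), 40), ((0, 1), 12), ((1, 0), 12), ((2, 2), 30)]:
    unit.update(distance, shift)
print(unit.reg1, unit.reg2)  # 12 (0, 1)
```

Because the comparison is strict, a later shift with an equal distance does not replace the stored one; whether the hardware behaves this way on ties is not stated in the text.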

As has been described hereinbefore, according to the embodiment of
Figs. 10 and 11, it is possible to determine the moving vectors of
the blocks of one column in parallel for the image which is
inputted in real time. As a result, the moving image
compressing/uncompressing system making use of the moving vector
is enabled to execute fast processing operations by mounting the
semiconductor integrated circuit of the present embodiment on the
system. Incidentally, the construction of Fig. 10 can naturally
loosen the pitch of the processing circuits by the method of
Figs. 4 and 7.

The embodiments according to the present invention have been
described hereinbefore. These embodiments have used the
two-dimensional memory arrays which have the word lines capable of
storing the pixel data of one or more lines. If the word lines are
excessively long, however, the wiring capacitance and resistance
may increase to make it difficult to effect a fast drive. In this
case, the arrays may be divided. If, however, a simple division is
made in that case, the pixels necessary for the processing circuit
arranged at the end of a sub-array are present in an adjacent
sub-array, which makes it necessary to provide a special access
path. In order to avoid this, the pixel data at the end of the
sub-array may be doubly owned by the adjoining sub-arrays. In the
Figures for explaining the embodiments, moreover, the detailed
construction of the two-dimensional memory array or the method of
producing the control signals is omitted but can be easily made by
the technique used in the ordinary LSI. For example, the
two-dimensional memory array can be exemplified by a DRAM array
made of single-transistor cells. Since, in this case, the
two-dimensional memory array can be constructed in high
integration, a larger number of processing circuits can be
integrated in the same chip size than that of the construction
using a SRAM array or the like. As a result, a faster processing
operation can be accomplished. Incidentally, as has been described
hereinbefore, most of the embodiments of the present invention use
all the information of the memory array within a short time
period. As a result, even in case the DRAM array is used, an
automatic refreshing is effected during the processing operation.
This raises an advantage that the refreshing need not be
accomplished by interrupting the processing operation.
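The double ownership of boundary pixels can be illustrated with a small sketch. It assumes one line of pixels held as a list, a line width divisible by the number of sub-arrays, and a 3x3 filter so that one extra pixel on each side is enough; `split_with_overlap` and `halo` are hypothetical names.

```python
def split_with_overlap(line, n_sub, halo=1):
    """Divide one line into n_sub sub-arrays, duplicating `halo` pixels
    across each internal boundary so that a processing circuit at the
    end of a sub-array needs no access path into its neighbor.
    Assumes len(line) is divisible by n_sub."""
    width = len(line)
    step = width // n_sub
    subs = []
    for i in range(n_sub):
        lo = max(i * step - halo, 0)            # extend left unless at the edge
        hi = min((i + 1) * step + halo, width)  # extend right unless at the edge
        subs.append(line[lo:hi])
    return subs


print(split_with_overlap(list(range(8)), 2))
# [[0, 1, 2, 3, 4], [3, 4, 5, 6, 7]]
```

Pixels 3 and 4 are held by both sub-arrays in this example, so the 3-wide window centered on either of them can be served from a single sub-array.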

According to the semiconductor integrated circuit of the present
invention, the processing operations using the two-dimensional
data such as the two-dimensional space filter, the convolution
processing operation, or the processing operation for searching
the moving vector between the images can be executed in parallel.
As a result, these processing operations can be executed at a high
speed in real time.

Claims 

1. A semiconductor integrated circuit comprising: a memory cell
array including a plurality of data lines, a plurality of word
lines intersecting said plurality of data lines, and a plurality
of memory cells disposed at desired intersections between said
plurality of data lines and said plurality of word lines; a
parallel data transfer circuit for transferring a plurality of
data in parallel from said plurality of data lines; and a
plurality of processing circuits for receiving said plurality of
data transferred from said parallel data transfer circuit, as
their input signals,

wherein said parallel data transfer circuit is enabled to transfer
two or more of said plurality of data to the individual ones of
said plurality of processing circuits by sequentially selecting
and selecting two or more of said plurality of data lines with the
individual ones of said plurality of processing circuits, and
wherein the adjoining ones of said plurality of processing
circuits can input the same data from the same data lines.

2. A semiconductor integrated circuit according to claim 1,
wherein the individual ones of said plurality of processing
circuits execute the processing operations by using the plurality
of data which are read out to one of said plurality of data lines
by selecting two or more of said plurality of word lines.

3. A semiconductor integrated circuit according to claim 1,
further comprising: a first serial access memory for storing
serial data inputted from the outside and outputting said serial
data in parallel to said plurality of data lines; and a second
serial access memory for transforming the output data of said
plurality of processing circuits into serial data and outputting
said serial data to the outside.

4. A semiconductor integrated circuit according to claim 2,
further comprising: a first serial access memory for storing
serial data inputted from the outside and outputting said serial
data in parallel to said plurality of data lines; and a second
serial access memory for transforming the output data of said
plurality of processing circuits into serial data and outputting
said serial data to the outside.

5. A semiconductor integrated circuit according to 
claim 1, wherein each of said plurality of process- 
ing circuits executes the processing operation by 
using said plurality of data from said memory cell 
array and a predetermined constant. 

6. A semiconductor integrated circuit according to 
claim 2, wherein each of said plurality of process- 
ing circuits executes the processing operation by 
using said plurality of data from said memory cell 
array and a predetermined constant. 

7. A semiconductor integrated circuit according to 
claim 3, wherein each of said plurality of process- 
ing circuits executes the processing operation by 
using said plurality of data from said memory cell 
array and a predetermined constant. 

8. A semiconductor integrated circuit comprising: a 
memory cell array including a plurality of data line 
groups, a plurality of word lines intersecting said 
plurality of data line groups, and a plurality of 
memory cells disposed at desired intersections 
between said plurality of data line groups and 
said plurality of word lines; a parallel data transfer 
circuit for transferring a plurality of data groups in 
parallel from said plurality of data line groups; 
and a plurality of processing circuits for receiving 
said plurality of data groups transferred from said 
parallel data transfer circuit, as their input signals, 

wherein said parallel data transfer circuit is 
enabled to transfer two or more of said plurality 
of data groups to the individual ones of said plur- 
ality of processing circuits by sequentially select- 
ing and selecting two or more of said plurality of 
data line groups with the individual ones of said 
plurality of processing circuits, and wherein the 
adjoining ones of said plurality of processing cir- 
cuits can input the same data group from the 
same data line groups. 

9. A semiconductor integrated circuit according to 
claim 8, wherein the individual ones of said plur- 
ality of processing circuits execute the process- 
ing operations by using the plurality of data 
groups which are read out to one of said plurality 
of data line groups by selecting two or more of 
said plurality of word lines. 

10. A semiconductor integrated circuit according to claim 8,
further comprising: a first serial access memory for storing
serial data inputted from the outside and outputting said serial
data in parallel to said plurality of data line groups; and a
second serial access memory for transforming the output data of
said plurality of processing circuits into serial data and
outputting said serial data to the outside.

11. A semiconductor integrated circuit according to claim 9,
further comprising: a first serial access memory for storing
serial data inputted from the outside and outputting said serial
data in parallel to said plurality of data line groups; and a
second serial access memory for transforming the output data of
said plurality of processing circuits into serial data and
outputting said serial data to the outside.

12. A semiconductor integrated circuit according to claim 8,
wherein each of said plurality of processing circuits executes the
processing operation by using said plurality of data groups from
said memory cell array and a predetermined constant.

13. A semiconductor integrated circuit according to claim 9,
wherein each of said plurality of processing circuits executes the
processing operation by using said plurality of data groups from
said memory cell array and a predetermined constant.

14. A semiconductor integrated circuit according to claim 10,
wherein each of said plurality of processing circuits executes the
processing operation by using said plurality of data groups from
said memory cell array and a predetermined constant.

15. A semiconductor integrated circuit comprising: first and
second memory cell arrays including a plurality of data lines, a
plurality of word lines intersecting said plurality of data lines,
and a plurality of memory cells disposed at desired intersections
between said plurality of data lines and said plurality of word
lines; a first parallel data transfer circuit for transferring a
plurality of first data in parallel from said plurality of data
lines of said first memory cell array; a second parallel data
transfer circuit for transferring a plurality of second data in
parallel from said plurality of data lines of said second memory
cell array; and a plurality of processing circuits for receiving
said plurality of first and second data transferred from said
first and second parallel data transfer circuits, as their input
signals,

wherein said first parallel data transfer circuit is enabled to
transfer two or more of said plurality of first data to the
individual ones of said plurality of processing circuits by
sequentially selecting and selecting two or more of said plurality
of first data lines with the individual ones of said plurality of
processing circuits, wherein the adjoining ones of said plurality
of processing circuits can input the same data from the same data
lines,

wherein said second parallel data transfer circuit is enabled to
transfer two or more of said plurality of second data to the
individual ones of said plurality of processing circuits by
sequentially selecting and selecting two or more of said plurality
of second data lines with the individual ones of said plurality of
processing circuits, and wherein the adjoining ones of said
plurality of processing circuits can input the same data from the
same data lines.

16. A semiconductor integrated circuit comprising: 
first and second memory cell arrays including a 
plurality of data line groups, a plurality of word 
lines intersecting said plurality of data line 
groups, and a plurality of memory cells disposed 
at desired intersections between said plurality of 
data line groups and said plurality of word lines; 
a first parallel data transfer circuit for transferring 
a plurality of first data groups in parallel from said 
plurality of data line groups of said first memory 
cell array; a second parallel data transfer circuit 
for transferring a plurality of second data groups 
in parallel from said plurality of data line groups 
of said second memory cell array; and a plurality
of processing circuits for receiving said plurality 
of first and second data groups transferred from 
said first and second parallel data transfer cir- 
cuits, as their input signals, 

wherein said first parallel data transfer cir- 
cuit is enabled to transfer two or more of said plur- 
ality of first data groups to the individual ones of 
said plurality of processing circuits by sequential- 
ly selecting and selecting two or more of said 
plurality of first data line groups with the individ- 
ual ones of said plurality of processing circuits, 
wherein the adjoining ones of said plurality of 
processing circuits can input the same data 
group from the same data line groups, 

wherein said second parallel data transfer 
circuit is enabled to transfer two or more of said 
plurality of second data groups to the individual 
ones of said plurality of processing circuits by se- 
quentially selecting and selecting two or more of 
said plurality of second data line groups with the 
individual ones of said plurality of processing cir- 
cuits, and wherein the adjoining ones of said plur- 
ality of processing circuits can input the same 
data group from the same data line groups. 

FIG. 2A Block Diagram
FIG. 3
FIG. 4
FIG. 5A Transfer network
FIG. 6
FIG. 7
FIG. 8A Parallel data transfer circuit
FIG. 10